A Bayesian hybrid Huberized support vector machine and its applications in high-dimensional medical data

Authors

  • Sounak Chakraborty
  • Ruixin Guo
Abstract

The hybrid Huberized support vector machine (HHSVM) with the elastic-net penalty has been developed for cancer tumor classification based on thousands of gene expression measurements. In this paper, we develop a Bayesian formulation of the hybrid Huberized support vector machine for binary classification. For the coefficients of the linear classification boundary, we propose a new type of prior that can select variables and group them together simultaneously. Our proposed prior is a scale mixture of normal distributions with independent gamma priors on a transformation of its variance. We establish a direct connection between the Bayesian HHSVM model with our special prior and the standard HHSVM solution with the elastic-net penalty. We propose a hierarchical Bayes and an empirical Bayes technique to select the penalty parameter. In the hierarchical Bayes model, the penalty parameter is selected using a Beta prior. For the empirical Bayes model, we estimate the penalty parameter by maximizing the marginal likelihood. The proposed model is applied to two simulated data sets and three real-life gene expression microarray data sets. Results suggest that our Bayesian models are highly successful in selecting similarly behaved important genes in groups and in predicting the cancer class. Most of the genes selected by our models have shown strong association with well-studied genetic pathways, further validating our claim.
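For orientation, the non-Bayesian HHSVM objective that the abstract refers to can be sketched as a huberized hinge loss plus an elastic-net penalty. The abstract does not state the formulas, so the piecewise form of the loss below is an assumption based on the usual HHSVM formulation, and the names `delta`, `lam1`, `lam2`, and all function names are hypothetical. This is a minimal sketch of the penalized objective only, not the authors' Bayesian model or their sampler.

```python
from typing import Sequence


def huberized_hinge(t: float, delta: float = 2.0) -> float:
    """Huberized hinge loss at margin t = y * f(x).

    Assumed form: zero for t > 1, quadratic on (1 - delta, 1],
    linear for t <= 1 - delta. delta > 0 is a hypothetical name
    for the width of the quadratic zone.
    """
    if t > 1.0:
        return 0.0
    if t > 1.0 - delta:
        return (1.0 - t) ** 2 / (2.0 * delta)
    return 1.0 - t - delta / 2.0


def elastic_net_penalty(beta: Sequence[float], lam1: float, lam2: float) -> float:
    """lam1 * ||beta||_1 + (lam2 / 2) * ||beta||_2^2."""
    l1 = sum(abs(b) for b in beta)
    l2_sq = sum(b * b for b in beta)
    return lam1 * l1 + 0.5 * lam2 * l2_sq


def hhsvm_objective(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    beta0: float,
    beta: Sequence[float],
    lam1: float,
    lam2: float,
    delta: float = 2.0,
) -> float:
    """Sum of huberized hinge losses over samples plus the elastic-net penalty.

    Labels in y are assumed to be coded as +1 / -1.
    """
    loss = 0.0
    for xi, yi in zip(X, y):
        score = beta0 + sum(b * v for b, v in zip(beta, xi))
        loss += huberized_hinge(yi * score, delta)
    return loss + elastic_net_penalty(beta, lam1, lam2)
```

The L1 term is what drives coefficients to exactly zero (variable selection), while the squared L2 term encourages correlated predictors to receive similar coefficients, which matches the "select and group" behavior the abstract attributes to the proposed prior.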


Similar resources

Department of Statistics, University of Missouri-Columbia, TR-MU-STAT-2009-09-07

The support vector machine (SVM) has been successfully applied to cancer tumor classification based on thousands of gene expression measurements. A modification of the SVM known as the hybrid Huberized support vector machine (HHSVM) has been developed for the same purpose, with a built-in gene selection mechanism provided by the elastic-net penalty. In this paper we develop a Bayesian formulation o...


PREDICTION OF SLOPE STABILITY STATE FOR CIRCULAR FAILURE: A HYBRID SUPPORT VECTOR MACHINE WITH HARMONY SEARCH ALGORITHM

Slope stability analysis is routinely performed by engineers to estimate the stability of river training works, road embankments, embankment dams, excavations and retaining walls. This paper presents a new approach to building a model for the prediction of slope stability state. The support vector machine (SVM) is a new machine learning method based on statistical learning theory, which can so...


Mammalian Eye Gene Expression Using Support Vector Regression to Evaluate a Strategy for Detecting Human Eye Disease

Background and purpose: Machine learning is a class of modern, powerful tools that can solve many important problems people face today. Support vector regression (SVR) is a machine learning method for building regression models. SVR has been proven to be an effective tool in real-value function estimation. As a supervised-learning appr...


Automatic Interpretation of UltraCam Imagery by Combination of Support Vector Machine and Knowledge-based Systems

With the development of digital sensors, an increasing number of high-resolution images are available. Interpreting these images manually is not feasible, which makes practical, fast and automatic solutions necessary for environmental and location-based management problems. Land cover classification using high-resolution imagery is a difficult process because of the c...


Hybrid huberized support vector machines for microarray classification and gene selection

MOTIVATION: The standard L(2)-norm support vector machine (SVM) is a widely used tool for microarray classification. Previous studies have demonstrated its superior performance in terms of classification accuracy. However, a major limitation of the SVM is that it cannot automatically select relevant genes for the classification. The L(1)-norm SVM is a variant of the standard L(2)-norm SVM, that ...



Journal:
  • Computational Statistics & Data Analysis

Volume 55


Publication date: 2011